Alignment of Eye Movements and Spoken Language for Semantic Image Understanding
Abstract
Extracting meaning from images is a challenging task that has generated much interest in recent years. In domains such as medicine, image understanding requires special expertise. Experts’ eye movements can act as pointers to important image regions, while their accompanying spoken language descriptions, informed by their knowledge and experience, call attention to the concepts and features associated with those regions. In this paper, we apply an unsupervised alignment technique, widely used in machine translation to align parallel corpora, to align observers’ eye movements with the verbal narrations they produce while examining an image. The resulting alignments can then be used to create a database of low-level image features and high-level semantic annotations corresponding to perceptually important image regions. Such a database can in turn be used to automatically annotate new images. Initial results demonstrate the feasibility of a framework that draws on recognized bitext alignment algorithms for performing unsupervised automatic semantic annotation of image regions. Planned enhancements to the methods are also discussed.
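The alignment idea in the abstract can be sketched with IBM Model 1, a classic unsupervised word-alignment model from machine translation. The abstract does not say which bitext alignment algorithm was used, so Model 1 here is only a stand-in. The sketch also assumes that fixations have already been mapped to discrete region labels; the labels (`region_1`, `region_2`), the narration words, and the function names are hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(word, region)] = P(word | region) with EM (IBM Model 1).

    pairs: list of (regions, words), one entry per image viewing, where
    `regions` are fixated-region labels and `words` are narration tokens.
    """
    word_vocab = {w for _, words in pairs for w in words}
    uniform = 1.0 / len(word_vocab)
    # Start from uniform translation probabilities.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, region) co-alignments
        total = defaultdict(float)  # expected alignments per region
        # E-step: distribute each word over the regions seen with it.
        for regions, words in pairs:
            for w in words:
                norm = sum(t[(w, r)] for r in regions)
                for r in regions:
                    frac = t[(w, r)] / norm
                    count[(w, r)] += frac
                    total[r] += frac
        # M-step: renormalise the expected counts per region.
        t = defaultdict(
            float, {(w, r): c / total[r] for (w, r), c in count.items()}
        )
    return t


def best_region(word, regions, t):
    """Return the region label most strongly associated with `word`."""
    return max(regions, key=lambda r: t[(word, r)])


# Toy "parallel corpus": fixated regions paired with narration words.
toy_pairs = [
    (["region_1", "region_2"], ["lesion", "scaly"]),
    (["region_1"], ["lesion"]),
]
toy_table = train_ibm_model1(toy_pairs)
lesion_region = best_region("lesion", ["region_1", "region_2"], toy_table)
scaly_region = best_region("scaly", ["region_1", "region_2"], toy_table)
```

In this toy corpus, the second viewing ties "lesion" to `region_1`, which pushes "scaly" toward `region_2` in the first viewing. Word-to-region pairs learned this way are what one would store in an annotation database of the kind the abstract describes.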
Similar resources
Bilinguals Show Weaker Lexical Access During Spoken Sentence Comprehension.
When bilinguals process written language, they show delays in accessing lexical items relative to monolinguals. The present study investigated whether this effect extended to spoken language comprehension, examining the processing of sentences with either low or high semantic constraint in both first and second languages. English-German bilinguals, German-English bilinguals and English monoling...
Meaningfulness of Religious Language in the Light of Conceptual Metaphorical Use of Image Schema: A Cognitive Semantic Approach
According to modern religious studies, religions are rooted in certain metaphorical representations, so they are metaphorical in nature. This article aims to show, first, how conceptual metaphors employ image schemas to make our language meaningful, and then to assert that image-schematic structure of religious expressions, by which religious metaphors conceptualize abstract meanings, is the ba...
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where many low-level image representations exist, e.g., SIFT, HOG and so on. However, there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...
Word meaning and the control of eye fixation: semantic competitor effects and the visual world paradigm.
When participants are presented simultaneously with spoken language and a visual display depicting objects to which that language refers, participants spontaneously fixate the visual referents of the words being heard [Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language: A new methodology for the real-time investigation of speech perception, memory, and language ...
Publication date: 2015